Vizer: A System to Vectorize Intel x86 Binaries
Authors
Abstract
Traditional compilers conduct optimizations on intermediate representations derived from high-level source code. However, it is sometimes necessary and fruitful to optimize executables or compiled object files. This paper describes the Vizer system, which automatically vectorizes object code for the Intel x86 architecture. Binary optimization offers the opportunity to improve performance in situations where the optimizer has no access to the source code. A binary optimizer can analyze and modify compiled code to increase performance. It can also edit the code to insert instrumentation and sensor code that lets other tools observe the running program's behavior. While opportunities for binary optimization arise in many contexts, our interest has two distinct sources. In the Grid Application Development Software (GrADS) project, we are developing tools that optimize an executable version of the program after resources have been chosen and a mapping from the problem to those resources has been established. The technology developed in Vizer is part of the GrADS compilation system. Our second motivation comes from the opportunities present in legacy binaries: executable programs that we cannot recompile. The Vizer technology gives us a way to apply code optimization techniques directly to these programs and to rewrite them in ways that take advantage of new hardware features such as those on the Intel x86 line of processors. The widespread use of the Intel x86 architecture makes it an attractive target for binary optimization. Additionally, Intel regularly adds new features to new models of this line. For example, recent Pentiums support Single Instruction Multiple Data (SIMD) instructions called the Multimedia Extensions (MMX), Streaming SIMD Extensions (SSE), and Streaming SIMD Extensions 2 (SSE2), which operate on packed integer and floating-point data. At the same time, however, these machines are backwards-compatible with older models in the line.
Thus, code compiled for older processors such as the 80386 can run on the latest Pentium processors. Because the x86 architecture is so popular, there is a large base of applications originally compiled for older x86 processors. Binary optimization offers a way to update these legacy applications so that they make good use of the modern features available on the latest Pentium models. The Vizer system demonstrates the potential for such improvement by implementing an object-level vectorizer that can dramatically reduce execution time for some codes. Although vectorization of programs written in higher-level languages has been well studied [1, 10, 6], vectorizing low-level assembly code poses a number of additional challenges. A source-level
Related resources
Towards Reconstructing Architectural Models of Software Tools by Runtime Analysis
We present a method and initial results on reverse engineering the architecture of monolithic software systems. Our approach is based on analysis of system binaries resulting in a series of models, which are successively refined into a component structure. Our approach comprises the following steps: 1) instrumentation of existing binaries for dynamically generating execution traces at runtime a...
DyVSoR: dynamic malware detection based on extracting patterns from value sets of registers
To control the exponential growth of malware files, security analysts pursue dynamic approaches that automatically identify and analyze malicious software samples. Obfuscation and polymorphism employed by malware make it difficult for signature-based systems to detect sophisticated malware files. The dynamic analysis or run-time behavior provides a better technique to identify the threat. In t...
Building applications for the Linux Standard Base
C. Yeoh. The goal of the Linux Standard Base (LSB) is to develop and promote a set of standards that will increase compatibility among Linux distributions and enable software applications to run on any compliant Linux system. There are currently LSB specifications available for the Intel Architecture IA-32 processors and for the 32- and 64-bit PowerPC, Itanium, 31- and 64-bit zSeries, and AMD64...
Performance Evaluation of Intel EPT Hardware Assist
For the majority of common workloads, performance in a virtualized environment is close to that in a native environment. Virtualization does create some overheads, however. These come from the virtualization of the CPU, the MMU (Memory Management Unit), and the I/O devices. In some of their recent x86 processors AMD and Intel have begun to provide hardware extensions to help bridge this perform...
Post Link-Time Optimization on the Intel IA-32 Architecture
Post link-time optimization of executables has been investigated by several projects in recent years. These optimization systems have targeted RISC architectures like the Compaq Alpha, and have shown that there is considerable room for improvement in compiler-generated code. Classical compiler optimizations like constant propagation, function inlining, and dead code elimination have been shown ...